Dataset statistics
| | Dataset A | Dataset B |
|---|---|---|
| Number of variables | 12 | 12 |
| Number of observations | 446 | 446 |
| Missing cells | 428 | 429 |
| Missing cells (%) | 8.0% | 8.0% |
| Duplicate rows | 0 | 0 |
| Duplicate rows (%) | 0.0% | 0.0% |
| Total size in memory | 45.3 KiB | 45.3 KiB |
| Average record size in memory | 104.0 B | 104.0 B |
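The percentages and sizes in this table can be cross-checked from the raw counts. The sketch below assumes that "Missing cells (%)" is the missing-cell count divided by variables × observations, and that the total size is the average record size times the row count; neither definition is stated in the report, and the variable names are illustrative.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 12
n_observations = 446
missing_cells = {"Dataset A": 428, "Dataset B": 429}

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations


def missing_pct(missing: int, total: int) -> float:
    """Share of missing cells as a percentage, rounded to one decimal."""
    return round(100 * missing / total, 1)


pct = {name: missing_pct(m, total_cells) for name, m in missing_cells.items()}
print(pct)

# Assumption: total memory = average record size (bytes) * number of rows.
avg_record_bytes = 104.0
total_kib = avg_record_bytes * n_observations / 1024
print(round(total_kib, 1))
```

Under these assumptions both datasets round to 8.0% missing cells and 45.3 KiB, matching the table.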
Variable types
| | Dataset A | Dataset B |
|---|---|---|
| Numeric | 5 | 5 |
| Categorical | 4 | 4 |
| Text | 3 | 3 |
Alerts
| Dataset A | Dataset B | Alert type |
|---|---|---|
| Age has 89 (20.0%) missing values | Age has 91 (20.4%) missing values | Missing |
| Cabin has 337 (75.6%) missing values | Cabin has 337 (75.6%) missing values | Missing |
| PassengerId has unique values | PassengerId has unique values | Unique |
| Name has unique values | Name has unique values | Unique |
| SibSp has 298 (66.8%) zeros | SibSp has 304 (68.2%) zeros | Zeros |
| Parch has 341 (76.5%) zeros | Parch has 341 (76.5%) zeros | Zeros |
| Fare has 5 (1.1%) zeros | Fare has 7 (1.6%) zeros | Zeros |
Reproduction
| | Dataset A | Dataset B |
|---|---|---|
| Analysis started | 2024-05-07 16:31:49.328736 | 2024-05-07 16:31:53.356932 |
| Analysis finished | 2024-05-07 16:31:53.355822 | 2024-05-07 16:31:57.261271 |
| Duration | 4.03 seconds | 3.9 seconds |
| Software version | ydata-profiling v0.0.dev0 | ydata-profiling v0.0.dev0 |
| Download configuration | config.json | config.json |
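The "Duration" row follows from the two timestamps. As a minimal check, assuming the timestamps use the `YYYY-MM-DD HH:MM:SS.ffffff` layout shown in the table (the helper name `duration_seconds` is illustrative):

```python
from datetime import datetime

# Assumed timestamp layout, matching the strings in the table.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def duration_seconds(started: str, finished: str) -> float:
    """Elapsed wall-clock time between two timestamps, in seconds."""
    delta = datetime.strptime(finished, FMT) - datetime.strptime(started, FMT)
    return delta.total_seconds()


duration_a = duration_seconds("2024-05-07 16:31:49.328736",
                              "2024-05-07 16:31:53.355822")
duration_b = duration_seconds("2024-05-07 16:31:53.356932",
                              "2024-05-07 16:31:57.261271")
print(round(duration_a, 2), round(duration_b, 2))
```

Rounded to two decimals, the differences agree with the reported 4.03 and 3.9 seconds.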
PassengerId
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 432.69283 | 446.57848 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 1 | 1 |
| Maximum | 885 | 890 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 1 | 1 |
| 5-th percentile | 40 | 46.25 |
| Q1 | 202.5 | 223.25 |
| median | 415.5 | 447 |
| Q3 | 668.75 | 669.75 |
| 95-th percentile | 847.75 | 857.75 |
| Maximum | 885 | 890 |
| Range | 884 | 889 |
| Interquartile range (IQR) | 466.25 | 446.5 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 260.1043 | 258.7778 |
| Coefficient of variation (CV) | 0.60112921 | 0.57946769 |
| Kurtosis | -1.2163897 | -1.186486 |
| Mean | 432.69283 | 446.57848 |
| Median Absolute Deviation (MAD) | 230.5 | 223.5 |
| Skewness | 0.086421156 | 0.02786659 |
| Sum | 192981 | 199174 |
| Variance | 67654.245 | 66965.948 |
| Monotonicity | Not monotonic | Not monotonic |
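Several rows in the PassengerId tables are derived from others. The sketch below recomputes them from the reported figures, assuming the usual definitions (range = max − min, IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared, mean = sum / count). Small differences in the last digits are expected because the inputs are already rounded.

```python
# Figures copied from the PassengerId tables (446 rows per dataset, none missing).
N_OBSERVATIONS = 446

reported = {
    "Dataset A": {"min": 1, "max": 885, "q1": 202.5, "q3": 668.75,
                  "mean": 432.69283, "std": 260.1043, "sum": 192981},
    "Dataset B": {"min": 1, "max": 890, "q1": 223.25, "q3": 669.75,
                  "mean": 446.57848, "std": 258.7778, "sum": 199174},
}


def derive(s: dict) -> dict:
    """Recompute the derived rows from the primary ones."""
    return {
        "range": s["max"] - s["min"],
        "iqr": s["q3"] - s["q1"],
        "cv": s["std"] / s["mean"],
        "variance": s["std"] ** 2,
        "mean": s["sum"] / N_OBSERVATIONS,
    }


derived = {name: derive(s) for name, s in reported.items()}
for name, d in derived.items():
    print(name, {k: round(v, 4) for k, v in d.items()})
```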
| Value | Count | Frequency (%) |
| 799 | 1 | 0.2% |
| 331 | 1 | 0.2% |
| 637 | 1 | 0.2% |
| 654 | 1 | 0.2% |
| 156 | 1 | 0.2% |
| 79 | 1 | 0.2% |
| 350 | 1 | 0.2% |
| 837 | 1 | 0.2% |
| 712 | 1 | 0.2% |
| 828 | 1 | 0.2% |
| Other values (436) | 436 |
| Value | Count | Frequency (%) |
| 39 | 1 | 0.2% |
| 360 | 1 | 0.2% |
| 237 | 1 | 0.2% |
| 797 | 1 | 0.2% |
| 391 | 1 | 0.2% |
| 876 | 1 | 0.2% |
| 437 | 1 | 0.2% |
| 887 | 1 | 0.2% |
| 725 | 1 | 0.2% |
| 286 | 1 | 0.2% |
| Other values (436) | 436 |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 3 | 1 | |
| 4 | 1 | |
| 5 | 1 | |
| 6 | 1 | |
| 9 | 1 | |
| 10 | 1 | |
| 12 | 1 | |
| 14 | 1 | |
| 15 | 1 |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 6 | 1 | |
| 8 | 1 | |
| 9 | 1 | |
| 10 | 1 | |
| 11 | 1 | |
| 13 | 1 | |
| 14 | 1 | |
| 20 | 1 | |
| 22 | 1 |
Survived
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 2 | 2 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 0 | 0 |
| 2nd row | 0 | 0 |
| 3rd row | 0 | 0 |
| 4th row | 0 | 1 |
| 5th row | 1 | 1 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 272 | 61.0% |
| 1 | 174 | 39.0% |
| Value | Count | Frequency (%) |
| 0 | 274 | 61.4% |
| 1 | 172 | 38.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 272 | |
| 1 | 174 |
| Value | Count | Frequency (%) |
| 0 | 274 | |
| 1 | 172 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 0 | 272 | |
| 1 | 174 |
| Value | Count | Frequency (%) |
| 0 | 274 | |
| 1 | 172 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 0 | 272 | |
| 1 | 174 |
| Value | Count | Frequency (%) |
| 0 | 274 | |
| 1 | 172 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 0 | 272 | |
| 1 | 174 |
| Value | Count | Frequency (%) |
| 0 | 274 | |
| 1 | 172 |
Pclass
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 3 | 3 |
| 2nd row | 3 | 3 |
| 3rd row | 3 | 2 |
| 4th row | 3 | 3 |
| 5th row | 3 | 1 |
Common Values
| Value | Count | Frequency (%) |
| 3 | 251 | 56.3% |
| 1 | 105 | 23.5% |
| 2 | 90 | 20.2% |
| Value | Count | Frequency (%) |
| 3 | 245 | 54.9% |
| 1 | 113 | 25.3% |
| 2 | 88 | 19.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 251 | |
| 1 | 105 | |
| 2 | 90 | 20.2% |
| Value | Count | Frequency (%) |
| 3 | 245 | |
| 1 | 113 | |
| 2 | 88 | 19.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 3 | 251 | |
| 1 | 105 | |
| 2 | 90 | 20.2% |
| Value | Count | Frequency (%) |
| 3 | 245 | |
| 1 | 113 | |
| 2 | 88 | 19.7% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 3 | 251 | |
| 1 | 105 | |
| 2 | 90 | 20.2% |
| Value | Count | Frequency (%) |
| 3 | 245 | |
| 1 | 113 | |
| 2 | 88 | 19.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 3 | 251 | |
| 1 | 105 | |
| 2 | 90 | 20.2% |
| Value | Count | Frequency (%) |
| 3 | 245 | |
| 1 | 113 | |
| 2 | 88 | 19.7% |
Name
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 82 | 82 |
| Median length | 49 | 48 |
| Mean length | 26.753363 | 27.269058 |
| Min length | 12 | 12 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 11932 | 12162 |
| Distinct characters | 59 | 60 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 446 | 446 |
| Unique (%) | 100.0% | 100.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | Ibrahim Shawah, Mr. Yousseff | Vander Planke, Miss. Augusta Maria |
| 2nd row | Skoog, Mr. Wilhelm | Kiernan, Mr. Philip |
| 3rd row | Gustafsson, Mr. Johan Birger | Butler, Mr. Reginald Fenton |
| 4th row | Ekstrom, Mr. Johan | Bing, Mr. Lee |
| 5th row | Chip, Mr. Chang | Frauenthal, Mrs. Henry William (Clara Heinsheimer) |
| Value | Count | Frequency (%) |
| mr | 256 | 14.1% |
| miss | 101 | 5.6% |
| mrs | 63 | 3.5% |
| william | 28 | 1.5% |
| master | 19 | 1.0% |
| henry | 17 | 0.9% |
| james | 15 | 0.8% |
| john | 14 | 0.8% |
| thomas | 13 | 0.7% |
| george | 12 | 0.7% |
| Other values (906) | 1273 |
| Value | Count | Frequency (%) |
| mr | 263 | 14.4% |
| miss | 87 | 4.8% |
| mrs | 67 | 3.7% |
| william | 33 | 1.8% |
| john | 24 | 1.3% |
| henry | 19 | 1.0% |
| master | 17 | 0.9% |
| george | 13 | 0.7% |
| thomas | 11 | 0.6% |
| charles | 10 | 0.5% |
| Other values (918) | 1277 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1366 | 11.4% |
| r | 931 | 7.8% |
| e | 868 | 7.3% |
| a | 810 | 6.8% |
| s | 679 | 5.7% |
| i | 668 | 5.6% |
| n | 637 | 5.3% |
| M | 568 | 4.8% |
| l | 519 | 4.3% |
| o | 498 | 4.2% |
| Other values (49) | 4388 |
| Value | Count | Frequency (%) |
| (space) | 1376 | 11.3% |
| r | 979 | 8.0% |
| e | 893 | 7.3% |
| a | 827 | 6.8% |
| n | 672 | 5.5% |
| s | 658 | 5.4% |
| i | 654 | 5.4% |
| M | 564 | 4.6% |
| l | 551 | 4.5% |
| o | 526 | 4.3% |
| Other values (50) | 4462 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 11932 |
| Value | Count | Frequency (%) |
| (unknown) | 12162 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1366 | 11.4% |
| r | 931 | 7.8% |
| e | 868 | 7.3% |
| a | 810 | 6.8% |
| s | 679 | 5.7% |
| i | 668 | 5.6% |
| n | 637 | 5.3% |
| M | 568 | 4.8% |
| l | 519 | 4.3% |
| o | 498 | 4.2% |
| Other values (49) | 4388 |
| Value | Count | Frequency (%) |
| (space) | 1376 | 11.3% |
| r | 979 | 8.0% |
| e | 893 | 7.3% |
| a | 827 | 6.8% |
| n | 672 | 5.5% |
| s | 658 | 5.4% |
| i | 654 | 5.4% |
| M | 564 | 4.6% |
| l | 551 | 4.5% |
| o | 526 | 4.3% |
| Other values (50) | 4462 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 11932 |
| Value | Count | Frequency (%) |
| (unknown) | 12162 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1366 | 11.4% |
| r | 931 | 7.8% |
| e | 868 | 7.3% |
| a | 810 | 6.8% |
| s | 679 | 5.7% |
| i | 668 | 5.6% |
| n | 637 | 5.3% |
| M | 568 | 4.8% |
| l | 519 | 4.3% |
| o | 498 | 4.2% |
| Other values (49) | 4388 |
| Value | Count | Frequency (%) |
| (space) | 1376 | 11.3% |
| r | 979 | 8.0% |
| e | 893 | 7.3% |
| a | 827 | 6.8% |
| n | 672 | 5.5% |
| s | 658 | 5.4% |
| i | 654 | 5.4% |
| M | 564 | 4.6% |
| l | 551 | 4.5% |
| o | 526 | 4.3% |
| Other values (50) | 4462 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 11932 |
| Value | Count | Frequency (%) |
| (unknown) | 12162 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1366 | 11.4% |
| r | 931 | 7.8% |
| e | 868 | 7.3% |
| a | 810 | 6.8% |
| s | 679 | 5.7% |
| i | 668 | 5.6% |
| n | 637 | 5.3% |
| M | 568 | 4.8% |
| l | 519 | 4.3% |
| o | 498 | 4.2% |
| Other values (49) | 4388 |
| Value | Count | Frequency (%) |
| (space) | 1376 | 11.3% |
| r | 979 | 8.0% |
| e | 893 | 7.3% |
| a | 827 | 6.8% |
| n | 672 | 5.5% |
| s | 658 | 5.4% |
| i | 654 | 5.4% |
| M | 564 | 4.6% |
| l | 551 | 4.5% |
| o | 526 | 4.3% |
| Other values (50) | 4462 |
Sex
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 6 | 6 |
| Median length | 4 | 4 |
| Mean length | 4.7399103 | 4.6995516 |
| Min length | 4 | 4 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 2114 | 2096 |
| Distinct characters | 5 | 5 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | male | female |
| 2nd row | male | male |
| 3rd row | male | male |
| 4th row | male | male |
| 5th row | male | female |
Common Values
| Value | Count | Frequency (%) |
| male | 281 | 63.0% |
| female | 165 | 37.0% |
| Value | Count | Frequency (%) |
| male | 290 | 65.0% |
| female | 156 | 35.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 611 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 165 | 7.8% |
| Value | Count | Frequency (%) |
| e | 602 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 156 | 7.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 2114 |
| Value | Count | Frequency (%) |
| (unknown) | 2096 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| e | 611 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 165 | 7.8% |
| Value | Count | Frequency (%) |
| e | 602 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 156 | 7.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 2114 |
| Value | Count | Frequency (%) |
| (unknown) | 2096 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| e | 611 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 165 | 7.8% |
| Value | Count | Frequency (%) |
| e | 602 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 156 | 7.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 2114 |
| Value | Count | Frequency (%) |
| (unknown) | 2096 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| e | 611 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 165 | 7.8% |
| Value | Count | Frequency (%) |
| e | 602 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 156 | 7.4% |
Age
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 74 | 70 |
| Distinct (%) | 20.7% | 19.7% |
| Missing | 89 | 91 |
| Missing (%) | 20.0% | 20.4% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 30.12465 | 29.268789 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.75 | 0.75 |
| Maximum | 80 | 74 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.75 | 0.75 |
| 5-th percentile | 4.8 | 4 |
| Q1 | 21 | 21 |
| median | 29 | 28 |
| Q3 | 39 | 37.5 |
| 95-th percentile | 56.2 | 52.3 |
| Maximum | 80 | 74 |
| Range | 79.25 | 73.25 |
| Interquartile range (IQR) | 18 | 16.5 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 14.566542 | 13.832121 |
| Coefficient of variation (CV) | 0.48354227 | 0.47258947 |
| Kurtosis | 0.26531611 | 0.17139424 |
| Mean | 30.12465 | 29.268789 |
| Median Absolute Deviation (MAD) | 9 | 8 |
| Skewness | 0.37389327 | 0.34414924 |
| Sum | 10754.5 | 10390.42 |
| Variance | 212.18413 | 191.32758 |
| Monotonicity | Not monotonic | Not monotonic |
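For Age, which has missing values, the reported percentages use two different denominators. The figures are consistent with "Missing (%)" being computed over all 446 rows, while "Distinct (%)" and the mean are computed over non-missing rows only. This is an inference from the numbers, not something the report states; the sketch below checks it.

```python
# Figures copied from the Age tables.
N_OBSERVATIONS = 446

age = {
    "Dataset A": {"missing": 89, "distinct": 74, "sum": 10754.5},
    "Dataset B": {"missing": 91, "distinct": 70, "sum": 10390.42},
}


def summarize(col: dict) -> dict:
    """Recompute Age percentages, assuming distinct % and mean ignore missing rows."""
    present = N_OBSERVATIONS - col["missing"]
    return {
        "present": present,
        "missing_pct": round(100 * col["missing"] / N_OBSERVATIONS, 1),
        "distinct_pct": round(100 * col["distinct"] / present, 1),
        "mean": col["sum"] / present,
    }


age_summary = {name: summarize(col) for name, col in age.items()}
for name, s in age_summary.items():
    print(name, s)
```

With 357 and 355 non-missing rows respectively, the recomputed values match the reported 20.7% / 19.7% distinct and the means 30.12465 / 29.268789.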
| Value | Count | Frequency (%) |
| 25 | 16 | 3.6% |
| 24 | 14 | 3.1% |
| 21 | 14 | 3.1% |
| 18 | 14 | 3.1% |
| 36 | 13 | 2.9% |
| 30 | 12 | 2.7% |
| 22 | 12 | 2.7% |
| 29 | 11 | 2.5% |
| 16 | 10 | 2.2% |
| 26 | 10 | 2.2% |
| Other values (64) | 231 | |
| (Missing) | 89 | 20.0% |
| Value | Count | Frequency (%) |
| 24 | 18 | 4.0% |
| 22 | 17 | 3.8% |
| 30 | 15 | 3.4% |
| 28 | 14 | 3.1% |
| 18 | 12 | 2.7% |
| 36 | 12 | 2.7% |
| 26 | 12 | 2.7% |
| 27 | 12 | 2.7% |
| 25 | 11 | 2.5% |
| 21 | 11 | 2.5% |
| Other values (60) | 221 | |
| (Missing) | 91 | 20.4% |
| Value | Count | Frequency (%) |
| 0.75 | 1 | 0.2% |
| 0.83 | 1 | 0.2% |
| 0.92 | 1 | 0.2% |
| 1 | 6 | |
| 2 | 4 | |
| 3 | 2 | 0.4% |
| 4 | 3 | |
| 5 | 2 | 0.4% |
| 6 | 2 | 0.4% |
| 7 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0.75 | 2 | 0.4% |
| 0.92 | 1 | 0.2% |
| 1 | 2 | 0.4% |
| 2 | 7 | |
| 3 | 2 | 0.4% |
| 4 | 5 | |
| 5 | 1 | 0.2% |
| 6 | 2 | 0.4% |
| 7 | 1 | 0.2% |
| 8 | 3 |
SibSp
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 7 | 7 |
| Distinct (%) | 1.6% | 1.6% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.55605381 | 0.56502242 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 8 | 8 |
| Zeros | 298 | 304 |
| Zeros (%) | 66.8% | 68.2% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 1 | 1 |
| 95-th percentile | 2.75 | 3 |
| Maximum | 8 | 8 |
| Range | 8 | 8 |
| Interquartile range (IQR) | 1 | 1 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 1.1593282 | 1.2082249 |
| Coefficient of variation (CV) | 2.0849209 | 2.1383664 |
| Kurtosis | 18.798185 | 16.374759 |
| Mean | 0.55605381 | 0.56502242 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 3.8007503 | 3.6172912 |
| Sum | 248 | 252 |
| Variance | 1.3440419 | 1.4598075 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 298 | |
| 1 | 106 | 23.8% |
| 2 | 19 | 4.3% |
| 3 | 9 | 2.0% |
| 4 | 8 | 1.8% |
| 8 | 5 | 1.1% |
| 5 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 304 | |
| 1 | 99 | 22.2% |
| 2 | 16 | 3.6% |
| 3 | 10 | 2.2% |
| 4 | 9 | 2.0% |
| 8 | 5 | 1.1% |
| 5 | 3 | 0.7% |
| Value | Count | Frequency (%) |
| 0 | 298 | |
| 1 | 106 | 23.8% |
| 2 | 19 | 4.3% |
| 3 | 9 | 2.0% |
| 4 | 8 | 1.8% |
| 5 | 1 | 0.2% |
| 8 | 5 | 1.1% |
| Value | Count | Frequency (%) |
| 0 | 304 | |
| 1 | 99 | 22.2% |
| 2 | 16 | 3.6% |
| 3 | 10 | 2.2% |
| 4 | 9 | 2.0% |
| 5 | 3 | 0.7% |
| 8 | 5 | 1.1% |
Parch
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 7 | 6 |
| Distinct (%) | 1.6% | 1.3% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.39013453 | 0.38116592 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 6 | 5 |
| Zeros | 341 | 341 |
| Zeros (%) | 76.5% | 76.5% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 0 | 0 |
| 95-th percentile | 2 | 2 |
| Maximum | 6 | 5 |
| Range | 6 | 5 |
| Interquartile range (IQR) | 0 | 0 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 0.84549911 | 0.79775217 |
| Coefficient of variation (CV) | 2.1671989 | 2.0929263 |
| Kurtosis | 10.275107 | 7.4936325 |
| Mean | 0.39013453 | 0.38116592 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 2.8577391 | 2.5141442 |
| Sum | 174 | 170 |
| Variance | 0.71486875 | 0.63640853 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 341 | |
| 1 | 57 | 12.8% |
| 2 | 38 | 8.5% |
| 4 | 4 | 0.9% |
| 3 | 3 | 0.7% |
| 5 | 2 | 0.4% |
| 6 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 341 | |
| 1 | 55 | 12.3% |
| 2 | 42 | 9.4% |
| 4 | 3 | 0.7% |
| 3 | 3 | 0.7% |
| 5 | 2 | 0.4% |
| Value | Count | Frequency (%) |
| 0 | 341 | |
| 1 | 57 | 12.8% |
| 2 | 38 | 8.5% |
| 3 | 3 | 0.7% |
| 4 | 4 | 0.9% |
| 5 | 2 | 0.4% |
| 6 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 341 | |
| 1 | 55 | 12.3% |
| 2 | 42 | 9.4% |
| 3 | 3 | 0.7% |
| 4 | 3 | 0.7% |
| 5 | 2 | 0.4% |
Ticket
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 376 | 383 |
| Distinct (%) | 84.3% | 85.9% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 18 | 18 |
| Median length | 17 | 17 |
| Mean length | 6.7488789 | 6.6345291 |
| Min length | 4 | 3 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 3010 | 2959 |
| Distinct characters | 35 | 35 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 322 | 342 |
| Unique (%) | 72.2% | 76.7% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 2685 | 345764 |
| 2nd row | 347088 | 367229 |
| 3rd row | 3101277 | 234686 |
| 4th row | 347061 | 1601 |
| 5th row | 1601 | PC 17611 |
| Value | Count | Frequency (%) |
| pc | 31 | 5.5% |
| c.a | 10 | 1.8% |
| a/5 | 8 | 1.4% |
| ca | 7 | 1.2% |
| w./c | 7 | 1.2% |
| sc/paris | 5 | 0.9% |
| 2343 | 5 | 0.9% |
| f.c.c | 4 | 0.7% |
| ston/o | 4 | 0.7% |
| 2 | 4 | 0.7% |
| Other values (399) | 479 |
| Value | Count | Frequency (%) |
| pc | 26 | 4.7% |
| c.a | 12 | 2.2% |
| a/5 | 11 | 2.0% |
| ca | 8 | 1.4% |
| soton/oq | 7 | 1.3% |
| 2343 | 5 | 0.9% |
| w./c | 5 | 0.9% |
| 1601 | 5 | 0.9% |
| line | 4 | 0.7% |
| 347082 | 4 | 0.7% |
| Other values (403) | 471 |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 383 | |
| 1 | 332 | |
| 2 | 289 | |
| 7 | 259 | |
| 4 | 237 | 7.9% |
| 0 | 217 | 7.2% |
| 6 | 209 | 6.9% |
| 5 | 186 | 6.2% |
| 9 | 154 | 5.1% |
| 8 | 146 | 4.9% |
| Other values (25) | 598 |
| Value | Count | Frequency (%) |
| 3 | 357 | |
| 1 | 341 | |
| 2 | 292 | |
| 7 | 244 | |
| 4 | 221 | 7.5% |
| 6 | 212 | 7.2% |
| 5 | 198 | 6.7% |
| 0 | 192 | 6.5% |
| 9 | 175 | 5.9% |
| 8 | 137 | 4.6% |
| Other values (25) | 590 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 3010 |
| Value | Count | Frequency (%) |
| (unknown) | 2959 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 3 | 383 | |
| 1 | 332 | |
| 2 | 289 | |
| 7 | 259 | |
| 4 | 237 | 7.9% |
| 0 | 217 | 7.2% |
| 6 | 209 | 6.9% |
| 5 | 186 | 6.2% |
| 9 | 154 | 5.1% |
| 8 | 146 | 4.9% |
| Other values (25) | 598 |
| Value | Count | Frequency (%) |
| 3 | 357 | |
| 1 | 341 | |
| 2 | 292 | |
| 7 | 244 | |
| 4 | 221 | 7.5% |
| 6 | 212 | 7.2% |
| 5 | 198 | 6.7% |
| 0 | 192 | 6.5% |
| 9 | 175 | 5.9% |
| 8 | 137 | 4.6% |
| Other values (25) | 590 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 3010 |
| Value | Count | Frequency (%) |
| (unknown) | 2959 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 3 | 383 | |
| 1 | 332 | |
| 2 | 289 | |
| 7 | 259 | |
| 4 | 237 | 7.9% |
| 0 | 217 | 7.2% |
| 6 | 209 | 6.9% |
| 5 | 186 | 6.2% |
| 9 | 154 | 5.1% |
| 8 | 146 | 4.9% |
| Other values (25) | 598 |
| Value | Count | Frequency (%) |
| 3 | 357 | |
| 1 | 341 | |
| 2 | 292 | |
| 7 | 244 | |
| 4 | 221 | 7.5% |
| 6 | 212 | 7.2% |
| 5 | 198 | 6.7% |
| 0 | 192 | 6.5% |
| 9 | 175 | 5.9% |
| 8 | 137 | 4.6% |
| Other values (25) | 590 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 3010 |
| Value | Count | Frequency (%) |
| (unknown) | 2959 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 3 | 383 | |
| 1 | 332 | |
| 2 | 289 | |
| 7 | 259 | |
| 4 | 237 | 7.9% |
| 0 | 217 | 7.2% |
| 6 | 209 | 6.9% |
| 5 | 186 | 6.2% |
| 9 | 154 | 5.1% |
| 8 | 146 | 4.9% |
| Other values (25) | 598 |
| Value | Count | Frequency (%) |
| 3 | 357 | |
| 1 | 341 | |
| 2 | 292 | |
| 7 | 244 | |
| 4 | 221 | 7.5% |
| 6 | 212 | 7.2% |
| 5 | 198 | 6.7% |
| 0 | 192 | 6.5% |
| 9 | 175 | 5.9% |
| 8 | 137 | 4.6% |
| Other values (25) | 590 |
Fare
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 173 | 178 |
| Distinct (%) | 38.8% | 39.9% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 32.755781 | 33.830502 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 512.3292 | 512.3292 |
| Zeros | 5 | 7 |
| Zeros (%) | 1.1% | 1.6% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 7.225 | 7.225 |
| Q1 | 7.9031 | 7.925 |
| median | 13.45835 | 14.45625 |
| Q3 | 31.359375 | 32.875 |
| 95-th percentile | 130.2375 | 120 |
| Maximum | 512.3292 | 512.3292 |
| Range | 512.3292 | 512.3292 |
| Interquartile range (IQR) | 23.456275 | 24.95 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 48.417266 | 49.688015 |
| Coefficient of variation (CV) | 1.4781289 | 1.4687342 |
| Kurtosis | 27.202454 | 24.475593 |
| Mean | 32.755781 | 33.830502 |
| Median Absolute Deviation (MAD) | 5.93545 | 6.92085 |
| Skewness | 4.2338723 | 4.0366934 |
| Sum | 14609.078 | 15088.404 |
| Variance | 2344.2317 | 2468.8988 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 7.8958 | 22 | 4.9% |
| 13 | 19 | 4.3% |
| 8.05 | 19 | 4.3% |
| 7.75 | 16 | 3.6% |
| 26 | 14 | 3.1% |
| 10.5 | 14 | 3.1% |
| 7.925 | 12 | 2.7% |
| 7.775 | 10 | 2.2% |
| 8.6625 | 10 | 2.2% |
| 7.225 | 8 | 1.8% |
| Other values (163) | 302 |
| Value | Count | Frequency (%) |
| 8.05 | 24 | 5.4% |
| 13 | 23 | 5.2% |
| 7.8958 | 21 | 4.7% |
| 26 | 15 | 3.4% |
| 10.5 | 15 | 3.4% |
| 7.75 | 14 | 3.1% |
| 26.55 | 9 | 2.0% |
| 7.225 | 8 | 1.8% |
| 7.775 | 8 | 1.8% |
| 7.2292 | 7 | 1.6% |
| Other values (168) | 302 |
| Value | Count | Frequency (%) |
| 0 | 5 | |
| 6.2375 | 1 | 0.2% |
| 6.45 | 1 | 0.2% |
| 6.75 | 1 | 0.2% |
| 6.975 | 1 | 0.2% |
| 7.0458 | 1 | 0.2% |
| 7.05 | 4 | |
| 7.0542 | 1 | 0.2% |
| 7.1417 | 1 | 0.2% |
| 7.225 | 8 |
| Value | Count | Frequency (%) |
| 0 | 7 | |
| 5 | 1 | 0.2% |
| 6.2375 | 1 | 0.2% |
| 6.45 | 1 | 0.2% |
| 6.75 | 2 | 0.4% |
| 6.95 | 1 | 0.2% |
| 7.05 | 3 | |
| 7.0542 | 1 | 0.2% |
| 7.125 | 1 | 0.2% |
| 7.1417 | 1 | 0.2% |
Cabin
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 87 | 93 |
| Distinct (%) | 79.8% | 85.3% |
| Missing | 337 | 337 |
| Missing (%) | 75.6% | 75.6% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 15 | 15 |
| Median length | 3 | 3 |
| Mean length | 3.6238532 | 3.6238532 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 395 | 395 |
| Distinct characters | 19 | 19 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 68 | 79 |
| Unique (%) | 62.4% | 72.5% |
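Cabin is mostly missing, so its percentages are easiest to read against the 109 non-missing rows (446 − 337). The sketch below assumes that "Distinct (%)", "Unique (%)" and "Mean length" are all computed over non-missing values; that is an inference from the reported numbers rather than a documented definition.

```python
# Figures copied from the Cabin tables.
N_OBSERVATIONS = 446

cabin = {
    "Dataset A": {"missing": 337, "distinct": 87, "unique": 68, "total_chars": 395},
    "Dataset B": {"missing": 337, "distinct": 93, "unique": 79, "total_chars": 395},
}


def text_summary(col: dict) -> dict:
    """Recompute Cabin ratios, assuming missing rows are excluded from the denominator."""
    present = N_OBSERVATIONS - col["missing"]
    return {
        "present": present,
        "distinct_pct": round(100 * col["distinct"] / present, 1),
        "unique_pct": round(100 * col["unique"] / present, 1),
        "mean_length": col["total_chars"] / present,
    }


cabin_summary = {name: text_summary(col) for name, col in cabin.items()}
for name, s in cabin_summary.items():
    print(name, s)
```

Under that assumption the recomputed ratios reproduce the reported 79.8% / 85.3% distinct, 62.4% / 72.5% unique, and a mean length of about 3.62 for both datasets.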
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | B57 B59 B63 B66 | D9 |
| 2nd row | A32 | B39 |
| 3rd row | D11 | C52 |
| 4th row | E44 | B58 B60 |
| 5th row | C50 | C126 |
| Value | Count | Frequency (%) |
| e101 | 3 | 2.3% |
| f2 | 3 | 2.3% |
| d | 3 | 2.3% |
| f | 3 | 2.3% |
| d35 | 2 | 1.6% |
| e44 | 2 | 1.6% |
| c123 | 2 | 1.6% |
| c78 | 2 | 1.6% |
| c26 | 2 | 1.6% |
| c22 | 2 | 1.6% |
| Other values (89) | 105 |
| Value | Count | Frequency (%) |
| g6 | 3 | 2.3% |
| b98 | 3 | 2.3% |
| b96 | 3 | 2.3% |
| c22 | 2 | 1.6% |
| b49 | 2 | 1.6% |
| b60 | 2 | 1.6% |
| b58 | 2 | 1.6% |
| c78 | 2 | 1.6% |
| c83 | 2 | 1.6% |
| c26 | 2 | 1.6% |
| Other values (96) | 105 |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 44 | |
| C | 38 | 9.6% |
| B | 32 | 8.1% |
| 1 | 31 | 7.8% |
| 3 | 30 | 7.6% |
| 6 | 29 | 7.3% |
| 5 | 25 | 6.3% |
| E | 21 | 5.3% |
| 7 | 21 | 5.3% |
| (space) | 20 | 5.1% |
| Other values (9) | 104 |
| Value | Count | Frequency (%) |
| C | 48 | |
| 2 | 38 | 9.6% |
| B | 35 | 8.9% |
| 1 | 35 | 8.9% |
| 6 | 33 | 8.4% |
| 5 | 26 | 6.6% |
| 3 | 23 | 5.8% |
| 8 | 23 | 5.8% |
| (space) | 19 | 4.8% |
| 9 | 19 | 4.8% |
| Other values (9) | 96 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 395 |
| Value | Count | Frequency (%) |
| (unknown) | 395 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 2 | 44 | |
| C | 38 | 9.6% |
| B | 32 | 8.1% |
| 1 | 31 | 7.8% |
| 3 | 30 | 7.6% |
| 6 | 29 | 7.3% |
| 5 | 25 | 6.3% |
| E | 21 | 5.3% |
| 7 | 21 | 5.3% |
| (space) | 20 | 5.1% |
| Other values (9) | 104 |
| Value | Count | Frequency (%) |
| C | 48 | |
| 2 | 38 | 9.6% |
| B | 35 | 8.9% |
| 1 | 35 | 8.9% |
| 6 | 33 | 8.4% |
| 5 | 26 | 6.6% |
| 3 | 23 | 5.8% |
| 8 | 23 | 5.8% |
| (space) | 19 | 4.8% |
| 9 | 19 | 4.8% |
| Other values (9) | 96 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 395 |
| Value | Count | Frequency (%) |
| (unknown) | 395 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 2 | 44 | |
| C | 38 | 9.6% |
| B | 32 | 8.1% |
| 1 | 31 | 7.8% |
| 3 | 30 | 7.6% |
| 6 | 29 | 7.3% |
| 5 | 25 | 6.3% |
| E | 21 | 5.3% |
| 7 | 21 | 5.3% |
| (space) | 20 | 5.1% |
| Other values (9) | 104 |
| Value | Count | Frequency (%) |
| C | 48 | |
| 2 | 38 | 9.6% |
| B | 35 | 8.9% |
| 1 | 35 | 8.9% |
| 6 | 33 | 8.4% |
| 5 | 26 | 6.6% |
| 3 | 23 | 5.8% |
| 8 | 23 | 5.8% |
| (space) | 19 | 4.8% |
| 9 | 19 | 4.8% |
| Other values (9) | 96 |
Most occurring blocks
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 395 | 100.0% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 395 | 100.0% |
Most frequent character per block
(unknown)
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 2 | 44 | 11.1% |
| C | 38 | 9.6% |
| B | 32 | 8.1% |
| 1 | 31 | 7.8% |
| 3 | 30 | 7.6% |
| 6 | 29 | 7.3% |
| 5 | 25 | 6.3% |
| E | 21 | 5.3% |
| 7 | 21 | 5.3% |
| | 20 | 5.1% |
| Other values (9) | 104 | 26.3% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| C | 48 | 12.2% |
| 2 | 38 | 9.6% |
| B | 35 | 8.9% |
| 1 | 35 | 8.9% |
| 6 | 33 | 8.4% |
| 5 | 26 | 6.6% |
| 3 | 23 | 5.8% |
| 8 | 23 | 5.8% |
| | 19 | 4.8% |
| 9 | 19 | 4.8% |
| Other values (9) | 96 | 24.3% |
Embarked
Categorical
| Dataset A | Dataset B | |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 2 | 1 |
| Missing (%) | 0.4% | 0.2% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| Dataset A | Dataset B | |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| Dataset A | Dataset B | |
|---|---|---|
| Total characters | 444 | 445 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| Dataset A | Dataset B | |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
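The Distinct and Unique rows measure different things, which a short sketch can make concrete. It assumes that "distinct" is the number of different non-missing values and that "unique" is the number of values occurring exactly once, the reading suggested by Distinct = 3 and Unique = 0 for this column. The function name `distinct_and_unique` is hypothetical and is not part of ydata-profiling.

```python
from collections import Counter


def distinct_and_unique(column):
    """Return (distinct, unique) for a column, ignoring missing values."""
    counts = Counter(v for v in column if v is not None)
    distinct = len(counts)  # how many different values occur
    unique = sum(1 for n in counts.values() if n == 1)  # values seen exactly once
    return distinct, unique


# Column rebuilt from the Dataset A counts reported under Common Values
# (319 S, 81 C, 44 Q, 2 missing); the row order is arbitrary.
embarked_a = ["S"] * 319 + ["C"] * 81 + ["Q"] * 44 + [None] * 2
print(distinct_and_unique(embarked_a))  # (3, 0)
```

Every port code repeats many times, so no value is unique even though three distinct values exist.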
Sample
| Dataset A | Dataset B | |
|---|---|---|
| 1st row | C | S |
| 2nd row | S | Q |
| 3rd row | S | S |
| 4th row | S | S |
| 5th row | S | S |
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| S | 319 | 71.5% |
| C | 81 | 18.2% |
| Q | 44 | 9.9% |
| (Missing) | 2 | 0.4% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| S | 328 | 73.5% |
| C | 84 | 18.8% |
| Q | 33 | 7.4% |
| (Missing) | 1 | 0.2% |
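The percentages in Common Values appear to be taken against all 446 observations, with missing entries counted as their own row (81 / 446 rounds to 18.2%). The sketch below reproduces the Dataset A rows under that assumption. The helper name `common_values` is hypothetical, and the column is rebuilt from the reported counts rather than read from the original file.

```python
from collections import Counter


def common_values(column):
    """Return (value, count, percent of all rows), most frequent first."""
    counts = Counter("(Missing)" if v is None else v for v in column)
    n_rows = len(column)  # assumption: missing rows stay in the denominator
    return [
        (value, count, round(100 * count / n_rows, 1))
        for value, count in counts.most_common()
    ]


# Dataset A counts for Embarked: 319 S, 81 C, 44 Q, 2 missing (446 rows)
column_a = ["S"] * 319 + ["C"] * 81 + ["Q"] * 44 + [None] * 2
table = common_values(column_a)
for row in table:
    print(row)
# ('S', 319, 71.5)
# ('C', 81, 18.2)
# ('Q', 44, 9.9)
# ('(Missing)', 2, 0.4)
```

The character-level tables further down use a different denominator (444 and 445 characters, since missing values contribute none), which is why C appears as 18.8% here but 18.9% there for Dataset B.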
[Figures: Length; Common Values (Plot), each for Dataset A and Dataset B]
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| s | 319 | 71.8% |
| c | 81 | 18.2% |
| q | 44 | 9.9% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| s | 328 | 73.7% |
| c | 84 | 18.9% |
| q | 33 | 7.4% |
Most occurring characters
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| S | 319 | 71.8% |
| C | 81 | 18.2% |
| Q | 44 | 9.9% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| S | 328 | 73.7% |
| C | 84 | 18.9% |
| Q | 33 | 7.4% |
Most occurring categories
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 444 | 100.0% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 445 | 100.0% |
Most frequent character per category
(unknown)
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| S | 319 | 71.8% |
| C | 81 | 18.2% |
| Q | 44 | 9.9% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| S | 328 | 73.7% |
| C | 84 | 18.9% |
| Q | 33 | 7.4% |
Most occurring scripts
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 444 | 100.0% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 445 | 100.0% |
Most frequent character per script
(unknown)
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| S | 319 | 71.8% |
| C | 81 | 18.2% |
| Q | 44 | 9.9% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| S | 328 | 73.7% |
| C | 84 | 18.9% |
| Q | 33 | 7.4% |
Most occurring blocks
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 444 | 100.0% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| (unknown) | 445 | 100.0% |
Most frequent character per block
(unknown)
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| S | 319 | 71.8% |
| C | 81 | 18.2% |
| Q | 44 | 9.9% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| S | 328 | 73.7% |
| C | 84 | 18.9% |
| Q | 33 | 7.4% |
First rows
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 798 | 799 | 0 | 3 | Ibrahim Shawah, Mr. Yousseff | male | 30.0 | 0 | 0 | 2685 | 7.2292 | NaN | C |
| 360 | 361 | 0 | 3 | Skoog, Mr. Wilhelm | male | 40.0 | 1 | 4 | 347088 | 27.9000 | NaN | S |
| 392 | 393 | 0 | 3 | Gustafsson, Mr. Johan Birger | male | 28.0 | 2 | 0 | 3101277 | 7.9250 | NaN | S |
| 129 | 130 | 0 | 3 | Ekstrom, Mr. Johan | male | 45.0 | 0 | 0 | 347061 | 6.9750 | NaN | S |
| 838 | 839 | 1 | 3 | Chip, Mr. Chang | male | 32.0 | 0 | 0 | 1601 | 56.4958 | NaN | S |
| 440 | 441 | 1 | 2 | Hart, Mrs. Benjamin (Esther Ada Bloomfield) | female | 45.0 | 1 | 1 | F.C.C. 13529 | 26.2500 | NaN | S |
| 660 | 661 | 1 | 1 | Frauenthal, Dr. Henry William | male | 50.0 | 2 | 0 | PC 17611 | 133.6500 | NaN | S |
| 752 | 753 | 0 | 3 | Vande Velde, Mr. Johannes Joseph | male | 33.0 | 0 | 0 | 345780 | 9.5000 | NaN | S |
| 402 | 403 | 0 | 3 | Jussila, Miss. Mari Aina | female | 21.0 | 1 | 0 | 4137 | 9.8250 | NaN | S |
| 311 | 312 | 1 | 1 | Ryerson, Miss. Emily Borie | female | 18.0 | 2 | 2 | PC 17608 | 262.3750 | B57 B59 B63 B66 | C |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 38 | 39 | 0 | 3 | Vander Planke, Miss. Augusta Maria | female | 18.0 | 2 | 0 | 345764 | 18.0000 | NaN | S |
| 214 | 215 | 0 | 3 | Kiernan, Mr. Philip | male | NaN | 1 | 0 | 367229 | 7.7500 | NaN | Q |
| 666 | 667 | 0 | 2 | Butler, Mr. Reginald Fenton | male | 25.0 | 0 | 0 | 234686 | 13.0000 | NaN | S |
| 74 | 75 | 1 | 3 | Bing, Mr. Lee | male | 32.0 | 0 | 0 | 1601 | 56.4958 | NaN | S |
| 334 | 335 | 1 | 1 | Frauenthal, Mrs. Henry William (Clara Heinsheimer) | female | NaN | 1 | 0 | PC 17611 | 133.6500 | NaN | S |
| 401 | 402 | 0 | 3 | Adams, Mr. John | male | 26.0 | 0 | 0 | 341826 | 8.0500 | NaN | S |
| 40 | 41 | 0 | 3 | Ahlin, Mrs. Johan (Johanna Persdotter Larsson) | female | 40.0 | 1 | 0 | 7546 | 9.4750 | NaN | S |
| 288 | 289 | 1 | 2 | Hosono, Mr. Masabumi | male | 42.0 | 0 | 0 | 237798 | 13.0000 | NaN | S |
| 865 | 866 | 1 | 2 | Bystrom, Mrs. (Karolina) | female | 42.0 | 0 | 0 | 236852 | 13.0000 | NaN | S |
| 695 | 696 | 0 | 2 | Chapman, Mr. Charles Henry | male | 52.0 | 0 | 0 | 248731 | 13.5000 | NaN | S |
Last rows
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 305 | 306 | 1 | 1 | Allison, Master. Hudson Trevor | male | 0.92 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S |
| 34 | 35 | 0 | 1 | Meyer, Mr. Edgar Joseph | male | 28.00 | 1 | 0 | PC 17604 | 82.1708 | NaN | C |
| 449 | 450 | 1 | 1 | Peuchen, Major. Arthur Godfrey | male | 52.00 | 0 | 0 | 113786 | 30.5000 | C104 | S |
| 100 | 101 | 0 | 3 | Petranec, Miss. Matilda | female | 28.00 | 0 | 0 | 349245 | 7.8958 | NaN | S |
| 290 | 291 | 1 | 1 | Barber, Miss. Ellen "Nellie" | female | 26.00 | 0 | 0 | 19877 | 78.8500 | NaN | S |
| 469 | 470 | 1 | 3 | Baclini, Miss. Helene Barbara | female | 0.75 | 2 | 1 | 2666 | 19.2583 | NaN | C |
| 69 | 70 | 0 | 3 | Kink, Mr. Vincenz | male | 26.00 | 2 | 0 | 315151 | 8.6625 | NaN | S |
| 497 | 498 | 0 | 3 | Shellard, Mr. Frederick William | male | NaN | 0 | 0 | C.A. 6212 | 15.1000 | NaN | S |
| 137 | 138 | 0 | 1 | Futrelle, Mr. Jacques Heath | male | 37.00 | 1 | 0 | 113803 | 53.1000 | C123 | S |
| 158 | 159 | 0 | 3 | Smiljanic, Mr. Mile | male | NaN | 0 | 0 | 315037 | 8.6625 | NaN | S |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 535 | 536 | 1 | 2 | Hart, Miss. Eva Miriam | female | 7.0 | 0 | 2 | F.C.C. 13529 | 26.2500 | NaN | S |
| 560 | 561 | 0 | 3 | Morrow, Mr. Thomas Rowan | male | NaN | 0 | 0 | 372622 | 7.7500 | NaN | Q |
| 673 | 674 | 1 | 2 | Wilhelms, Mr. Charles | male | 31.0 | 0 | 0 | 244270 | 13.0000 | NaN | S |
| 870 | 871 | 0 | 3 | Balkic, Mr. Cerin | male | 26.0 | 0 | 0 | 349248 | 7.8958 | NaN | S |
| 709 | 710 | 1 | 3 | Moubarek, Master. Halim Gonios ("William George") | male | NaN | 1 | 1 | 2661 | 15.2458 | NaN | C |
| 302 | 303 | 0 | 3 | Johnson, Mr. William Cahoone Jr | male | 19.0 | 0 | 0 | LINE | 0.0000 | NaN | S |
| 489 | 490 | 1 | 3 | Coutts, Master. Eden Leslie "Neville" | male | 9.0 | 1 | 1 | C.A. 37671 | 15.9000 | NaN | S |
| 872 | 873 | 0 | 1 | Carlsson, Mr. Frans Olof | male | 33.0 | 0 | 0 | 695 | 5.0000 | B51 B53 B55 | S |
| 874 | 875 | 1 | 2 | Abelson, Mrs. Samuel (Hannah Wizosky) | female | 28.0 | 1 | 0 | P/PP 3381 | 24.0000 | NaN | C |
| 728 | 729 | 0 | 2 | Bryhl, Mr. Kurt Arnold Gottfrid | male | 25.0 | 1 | 0 | 236853 | 26.0000 | NaN | S |
Duplicate rows
Dataset A
| PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | # duplicates | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Dataset does not contain duplicate rows. | |||||||||||||
Dataset B
| PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked | # duplicates | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Dataset does not contain duplicate rows. | |||||||||||||